Parallel and Distributed Closed Regular Pattern Mining in Large Databases

Authors

  • M. Sreedevi
  • L. S. S. Reddy
Abstract

Because of the huge increase in the number of records and dimensions in available databases, pattern mining in large databases is a challenging problem. A good number of parallel and distributed frequent pattern (FP) mining algorithms have been proposed for large and distributed databases, based on the frequency of itemsets. Besides frequency, the regularity of an item can also be considered an emerging factor in data mining research. Closed itemset mining has recently gained a lot of attention as well. Although some algorithms have been developed to mine regular patterns, no algorithm exists to mine closed regular patterns in parallel and distributed databases. In this paper we introduce a novel method called the PDCRP method (Parallel and Distributed Closed Regular Pattern) to discover closed regular patterns using the vertical data format on large databases. The method works at each local processor, which reduces inter-processor communication overhead, achieves a high degree of parallelism, and generates the complete set of closed regular patterns. Our experimental results show that the PDCRP method is highly efficient on large databases.
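The abstract does not define its terms, so the following is only a minimal single-process sketch under assumptions taken from the general regular-pattern literature, not the authors' PDCRP method. It assumes that the regularity of a pattern is the largest gap between consecutive transaction ids containing it (counting the gaps from 0 to the first occurrence and from the last occurrence to the database size), that a pattern is regular when this value does not exceed a user threshold (called `max_reg` here), and that a pattern is closed when no proper superset has the same transaction-id list. All function names are hypothetical, and candidates are enumerated by brute force for clarity rather than efficiency.

```python
from itertools import combinations


def to_vertical(transactions):
    """Vertical data format: map each item to the sorted list of
    1-based transaction ids (tids) in which it occurs."""
    vertical = {}
    for tid, items in enumerate(transactions, start=1):
        for item in items:
            vertical.setdefault(item, []).append(tid)
    return vertical


def regularity(tids, db_size):
    """Assumed definition: largest gap between consecutive occurrences,
    including 0 -> first tid and last tid -> db_size."""
    points = [0] + list(tids) + [db_size]
    return max(b - a for a, b in zip(points, points[1:]))


def regular_patterns(transactions, max_reg):
    """Brute-force enumeration of all itemsets whose regularity is at
    most max_reg. Tid-lists of itemsets are obtained by intersecting
    the tid-lists of their items."""
    vertical = to_vertical(transactions)
    db_size = len(transactions)
    items = sorted(vertical)
    found = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            tids = sorted(set.intersection(*(set(vertical[i]) for i in combo)))
            if tids and regularity(tids, db_size) <= max_reg:
                found[frozenset(combo)] = tids
    return found


def closed_only(patterns):
    """Keep a pattern only if no proper superset has the same tid-list."""
    return {
        p: tids
        for p, tids in patterns.items()
        if not any(p < q and tids == q_tids for q, q_tids in patterns.items())
    }


if __name__ == "__main__":
    db = [{"a", "b"}, {"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
    for pattern, tids in closed_only(regular_patterns(db, max_reg=2)).items():
        print(sorted(pattern), tids)
```

In a parallel or distributed setting, each processor would hold its own partition of the transactions; how PDCRP combines the local results is not described in the abstract, so that step is deliberately left out of the sketch.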


Similar resources

Parallel Regular-Frequent Pattern Mining in Large Databases

Mining interesting patterns in various domains is an important area in data mining and the knowledge discovery process. A number of parallel and distributed frequent pattern mining algorithms have been proposed so far for large and/or distributed databases. Occurrence frequency is not the only criterion for mining patterns; the occurrence behavior (regularity) of a pattern may also be inclu...


Closed Regular Pattern Mining Using Vertical Format

Discovering interesting patterns in transactional databases is often challenging because of the length of the patterns and the number of transactions, which makes mining prohibitively expensive in both time and space. Closed itemset mining was derived from traditional frequent pattern mining and has its own importance in data mining applications. Recently, regular itemset mining gained a lot of ...


Mining Closed-Regular Patterns in Incremental Transactional Databases using Vertical Data Format

Regular pattern mining on incremental databases is a novel approach in data mining research. Recently, closed itemset mining has gained a lot of consideration in the mining process. In this paper we propose a new mining method called CRPMID (Closed-Regular Pattern Mining on Incremental Databases) that uses a sliding window technique and the vertical data format. This method generates the complete set of closed-re...


A High Performance Distributed Tool for Mining Patterns in Biological Sequences

The identification of interesting patterns (or subsequences) in biosequences plays an important role in computational biology. Databases of genomic and proteomic sequences have grown exponentially, and pattern discovery is therefore a hard problem that requires clever strategies and powerful pattern languages to achieve manageable levels of efficiency. As far as we are aware, known tools are eithe...


Distributed Algorithm for Frequent Pattern Mining using HadoopMap Reduce Framework

With the rapid growth of information technology and in many business applications, mining frequent patterns and finding associations among them requires handling large and distributed databases. As the FP-tree is considered to be the best compact data structure for holding data patterns in memory, there have been efforts to make it parallel and distributed to handle large databases. However, it incurs l...




Publication date: 2013